HTTXPlugin

HTML to text conversion plugin for AWeb
by Gabriele Favrin



THE AUTHOR IS IN NO WAY RESPONSIBLE FOR ANY DAMAGE THAT THESE PROGRAMS MAY CAUSE.

Redistribution or modification is forbidden. If you improve these scripts, send the new version to me at favrin@tin.it.



With the HTTXPlugin you can use HTTX from within AWeb to save (or print) HTML pages as text.

This version of the interface requires AWeb II 3.1 or later and HTTX 1.7 or later. Also, the "Delete" command must be present in the C: directory.



Installation



Configuration

HTTX must be configured before use. Load HTTXPluginCFG.awebrx with AWeb to set the various options:

HTML conversion related configuration

Default Directory
Default directory to save the converted files to.

Line length
Maximum length for each line. (option LEN)

Indentation
Indentation in spaces of <UL>, <OL> or <DL> lists. (option INDENT)

ANSI conversion of HTML styles and LINKS.
Conversion of HTML styles and LINKS (HREF and NAME) to ANSI sequences and optimization of alignment functions. (option ANSI)

Conversion of HTML entities to 7-bit ASCII
Conversion of HTML entities (accented letters and so on) to ASCII codes lower than 128. (option 7BIT)

HR draw mode
Line drawing for <HR> tag. (option HRMODE=nn)

Ignore text and HR alignment
Don't center or right justify text and lines. (option NOALIGN)

Save Title as FileNote
Save the document title (<TITLE>) as the output file's file comment. Only used when saving as text. (option FILENOTE)

Suppress Title/URL
Suppresses insertion of the document title (<TITLE>) and URL in the output file.

Print HTML Link References
Add HTML link references to the converted file. (option HREF)

Print Image Alternate Texts
Add the ALT-text of images (<IMG>) to the output file. (option IMG)

Print JavaScript scripts
Add the text (NOT the result!) of JavaScript (or other scripts) to the converted file. (option SCRIPT)

The content of <SCRIPT> is left unchanged. No word wrapping, entity conversion, or tag parsing is done at all. This may cause problems with some types of printers.

Refer to the HTTX documentation for more information.

Support badly written HTML
Partial support for badly written HTML documents. Use with caution! (option BADHTML)

The standard HTTX settings will be saved in the environment variables HTTX.prefs (for Save as text) and HTTXprt.prefs (for Print as text).

Other options (like NOHEADER, FORCE or INCLUDE) are managed directly by the plugin and should NOT be put in the configuration used with this interface. Refer to the HTTX documentation for a full explanation of the options.
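
To give a rough idea of what the line length, indentation and 7-bit options do to a piece of text, here is a minimal Python sketch. It is NOT HTTX's own code: the function names (to_7bit, wrap_list_item) and the default values are invented for this example, and HTTX's real output may differ in detail. In particular, this sketch simply drops characters that have no plain ASCII base letter, which a real converter would probably replace with something more useful.

```python
import html
import textwrap
import unicodedata


def to_7bit(text):
    """Decode HTML entities, then reduce the result to ASCII codes below 128."""
    decoded = html.unescape(text)
    # NFKD splits an accented letter into its base letter plus a combining
    # accent; the accent is then dropped because it cannot be encoded as ASCII.
    decomposed = unicodedata.normalize("NFKD", decoded)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def wrap_list_item(text, length=76, indent=4):
    """Wrap one list item to at most `length` columns, indented by `indent` spaces."""
    pad = " " * indent
    return textwrap.fill(text, width=length,
                         initial_indent=pad, subsequent_indent=pad)


if __name__ == "__main__":
    item = to_7bit("Caff&egrave; e t&egrave; sono disponibili ogni giorno al bar")
    print(wrap_list_item(item, length=30, indent=4))
```

Here the line length includes the indentation, so a smaller length or a larger indent leaves less room for text on each line.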

Plugin related configuration

Print HTTP/META header
Add the HTTP header information and the <META> tags present in the HTML source to the converted file.

Example of the added text:

Header:
HTTP/1.0 200 OK
Server: Netscape-Enterprise/2.01
Date: Sat, 26 Apr 1997 06:38:12 GMT
Accept-ranges: bytes
Last-modified: Tue, 15 Apr 1997 13:57:46 GMT
Content-length: 7085
Content-type: text/html
Content-Type: text/html; charset=ISO-8859-1

Show preview of converted document
After a document is converted it will be shown using MultiView (which must be in the "SYS:Utilities/" drawer). After you close MultiView, a requester will ask whether the document should be kept or deleted. If you select Delete, the file will be erased and no conversion information will be shown.

Note that some versions of MultiView seem to have problems with some of the ANSI codes used by HTTX. If part of the text is not indented, don't worry: the problem is in MultiView, not in the converted file. See the HTTX documentation for more information.

This option is only used when saving as text.

Show conversion report
After converting a document, an info page containing the current HTTX settings is normally shown. You can choose between this and a requester that reports only whether the conversion was successful.

The HTTX-AWeb settings are saved in the environment variable HTTXPlugin.config.


Saving preferences for HTTX works like the standard WB preference programs. Use the "Save" or "Use" option to store the data. With "Save" the settings are stored in ENVARC: and the changes are permanent. With "Use" they are saved only in ENV:, so the changes are lost when you reboot. The first time you configure HTTX you MUST use the "Save" option.
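
The difference between "Save" and "Use" can be pictured with the short Python sketch below. It is only a model, not the plugin's code: two ordinary directories stand in for ENV: and ENVARC:, the function name store_prefs is invented, and the value written is placeholder text rather than the real settings format. It also assumes that "Save" refreshes the ENV: copy as well, as the standard Workbench preference programs do.

```python
import os
import tempfile


def store_prefs(name, value, env_dir, envarc_dir, permanent):
    """Write one settings variable the way "Use" and "Save" are described above.

    env_dir stands in for ENV: (lost on reboot), envarc_dir for ENVARC:.
    """
    targets = [env_dir]
    if permanent:
        # "Save": also write the permanent copy (assumed behaviour, see above).
        targets.append(envarc_dir)
    for directory in targets:
        os.makedirs(directory, exist_ok=True)
        with open(os.path.join(directory, name), "w") as handle:
            handle.write(value)
    return targets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        env = os.path.join(root, "ENV")
        envarc = os.path.join(root, "ENVARC")
        store_prefs("HTTXPlugin.config", "placeholder settings", env, envarc,
                    permanent=False)  # "Use"
        print(os.path.exists(os.path.join(envarc, "HTTXPlugin.config")))
```

In this model a reboot corresponds to emptying the ENV directory and copying ENVARC back into it, which is why settings stored only with "Use" do not survive.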



Usage

To save or print a page use the previously set menu or popup entries.

If the page contains frames (or iframes) a requester will open. You may choose to save all frames to a single file, or to be asked for each frame whether to save or skip it.

If preview is active when saving, each converted document is shown first, followed by the keep or delete requester.